Active Learning for Dependency Parsing with Partial Annotation
Abstract
Different from traditional active learning based on sentence-wise full annotation (FA), this paper proposes active learning with dependency-wise partial annotation (PA) as a finer-grained unit for dependency parsing. At each iteration, we select a few most uncertain words from an unlabeled data pool, manually annotate their syntactic heads, and add the partial trees into labeled data for parser retraining. Compared with sentence-wise FA, dependency-wise PA gives us more flexibility in task selection and avoids wasting time on annotating trivial tasks in a sentence. Our work makes the following contributions. First, we are the first to apply a probabilistic model to active learning for dependency parsing, which can 1) provide tree probabilities and dependency marginal probabilities as principled uncertainty metrics, and 2) directly learn parameters from PA based on a forest-based training objective. Second, we propose and compare several uncertainty metrics through simulation experiments on both Chinese and English. Finally, we conduct human annotation experiments to compare FA and PA on real annotation time and quality.
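The selection step described in the abstract (pick the few most uncertain words in the unlabeled pool) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the abstract does not say which uncertainty metrics are compared, so the entropy of each word's head distribution is used here as an assumed metric. The function names, the `pool` layout, and the marginal probabilities are all hypothetical, and the parser that would produce the marginals is not shown.

```python
import math


def head_entropy(marginals):
    """Entropy of one word's probability distribution over candidate heads.

    Assumed metric: a flat distribution (high entropy) means the parser
    is unsure which head the word attaches to.
    """
    return -sum(p * math.log(p) for p in marginals if p > 0.0)


def select_uncertain_words(pool, k):
    """Return the k (sentence_id, word_index) pairs with the highest entropy.

    `pool` maps a sentence id to a list with one head distribution per word.
    Ties are broken by sentence id and word index so the result is stable.
    """
    scored = []
    for sent_id, words in pool.items():
        for idx, marginals in enumerate(words):
            scored.append((head_entropy(marginals), sent_id, idx))
    scored.sort(key=lambda t: (-t[0], t[1], t[2]))
    return [(sent_id, idx) for _, sent_id, idx in scored[:k]]


# Hypothetical marginals: two sentences, two words each, three candidate heads.
pool = {
    "s1": [[0.98, 0.01, 0.01], [0.40, 0.35, 0.25]],
    "s2": [[0.50, 0.50, 0.00], [0.90, 0.05, 0.05]],
}

# The selected words would be sent to an annotator for head annotation.
selected = select_uncertain_words(pool, 2)
print(selected)
```

In this toy pool, the second word of `s1` and the first word of `s2` have the flattest head distributions, so they are the ones picked for annotation; the confidently attached words are skipped, which is the saving that partial annotation is meant to provide.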
Similar resources
Combining Active Learning and Partial Annotation for Japanese Dependency Parsing
The machine learning-based approaches that dominate natural language processing research require massive amounts of labeled training data. Active learning has the potential to substantially reduce the human effort needed to prepare this data by allowing annotators to focus on only the most informative training examples. This paper shows how active learning can be used for domain adaptation of d...
Combining Active Learning and Partial Annotation for Domain Adaptation of a Japanese Dependency Parser
The machine learning-based approaches that dominate natural language processing research require massive amounts of labeled training data. Active learning has the potential to substantially reduce the human effort needed to prepare this data by allowing annotators to focus on only the most informative training examples. This paper shows that active learning can be used for domain adaptation of ...
Annotation Projection-based Representation Learning for Cross-lingual Dependency Parsing
Cross-lingual dependency parsing aims to train a dependency parser for an annotation-scarce target language by exploiting annotated training data from an annotation-rich source language, which is of great importance in the field of natural language processing. In this paper, we propose to address cross-lingual dependency parsing by inducing latent crosslingual data representations via matrix co...
Active Learning for Dependency Parsing Using Partially Annotated Sentences
Current successful probabilistic parsers require large treebanks, which are difficult, time-consuming, and expensive to produce. Some parts of these data do not contain any useful information for training a parser. Active learning strategies make it possible to select the most informative samples for annotation. Most existing active learning strategies for parsing rely on selecting uncertain sentences for ...
Using Smaller Constituents Rather Than Sentences in Active Learning for Japanese Dependency Parsing
We investigate active learning methods for Japanese dependency parsing. We propose active learning methods of using partial dependency relations in a given sentence for parsing and evaluate their effectiveness empirically. Furthermore, we utilize syntactic constraints of Japanese to obtain more labeled examples from precious labeled ones that annotators give. Experimental results show that our ...